Test: bwc test for text chunking processor #661

Merged
29 commits merged on Apr 17, 2024

Member

@yuye-aws yuye-aws commented Apr 2, 2024

Description

Implement a backward compatibility (BWC) test for the text chunking processor: #607

Issues Resolved

BWC test issue: #647

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

codecov bot commented Apr 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.19%. Comparing base (cc6a6b2) to head (bea5111).
Report is 6 commits behind head on main.

❗ Current head bea5111 differs from pull request most recent head 9b2334e. Consider uploading reports for the commit 9b2334e to get more accurate results

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #661      +/-   ##
============================================
+ Coverage     84.04%   84.19%   +0.14%     
+ Complexity      744      743       -1     
============================================
  Files            59       59              
  Lines          2313     2309       -4     
  Branches        374      370       -4     
============================================
  Hits           1944     1944              
  Misses          214      214              
+ Partials        155      151       -4     

@yuye-aws yuye-aws marked this pull request as draft April 3, 2024 07:08
@yuye-aws yuye-aws marked this pull request as ready for review April 3, 2024 08:03
@vibrantvarun
Member

@yuye-aws Can you resolve the merge conflict? I will review this PR tonight.

Signed-off-by: yuye-aws <yuyezhu@amazon.com>
@yuye-aws
Member Author

I have an interesting observation. The text chunking BWC test passes when I create the index with 3 shards, but it fails when I reduce the shard count from 3 to 1. See the file ChunkingIndexSettings.json in my BWC PR for more details.

@yuye-aws
Member Author

To be specific, the test fails because the chunking index cannot be found. That is odd, since the index was explicitly created beforehand.

@@ -130,4 +132,11 @@ protected void createPipelineForSparseEncodingProcessor(String modelId, String p
        );
        createPipelineProcessor(requestBody, pipelineName, modelId);
    }

    protected void createPipelineForTextChunkingProcessor(String pipelineName) throws Exception {
Collaborator

Member

+1, maybe we can have a utility class for common methods like this

@zane-neo
Collaborator

Is there an assertion confirming that the index was created successfully? Could you also check whether other BWC tests have the same issue? Please identify the root cause, since this could be a bug in the test framework code.

@yuye-aws
Member Author

yuye-aws commented Apr 11, 2024

I created two sample PRs to reproduce the bug: #684 and #685. In the first PR, the only difference is the index setting in the file ChunkingIndexSettings.json. In the second PR, I removed the shard number and replica settings from the existing BWC test indices.

@yuye-aws
Member Author

Is there any assertion to confirm the index creation is successfully?

For index creation, there is an assertion in the function createIndexWithConfiguration in BaseNeuralSearchIT.java.
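
One way to make the existence check explicit right before ingesting documents is to call the index exists API (`HEAD /<index>`), which returns 200 when the index exists and 404 when it does not. The sketch below is only an illustration built on the JDK's `java.net.http` client. `IndexExistenceCheck`, its method names, and the base URL are hypothetical, and this is not the assertion used in `createIndexWithConfiguration`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper (not part of neural-search): checks whether an index exists.
final class IndexExistenceCheck {

    // Builds a HEAD request against <baseUrl>/<indexName>.
    static HttpRequest buildExistsRequest(String baseUrl, String indexName) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + indexName))
            .method("HEAD", HttpRequest.BodyPublishers.noBody())
            .build();
    }

    // Maps the HTTP status of the exists call to a boolean.
    static boolean existsFromStatus(int status) {
        if (status == 200) {
            return true;
        }
        if (status == 404) {
            return false;
        }
        // Any other status is treated as an unexpected condition.
        throw new IllegalStateException("Unexpected status from index exists check: " + status);
    }

    // Sends the request and interprets the response; requires a reachable cluster.
    static boolean indexExists(HttpClient client, String baseUrl, String indexName) throws Exception {
        HttpResponse<Void> response = client.send(
            buildExistsRequest(baseUrl, indexName),
            HttpResponse.BodyHandlers.discarding()
        );
        return existsFromStatus(response.statusCode());
    }
}
```

Calling `indexExists` immediately before the first `_doc` request would help separate "the index was never created" from "the index was created and later went missing".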

@yuye-aws
Member Author

Here is a snippet of the error log from #684:

REPRODUCE WITH: ./gradlew ':qa:rolling-upgrade:testAgainstOldCluster' --tests "org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT.testTextChunkingProcessor_E2EFlow" -Dtests.seed=9F3C01C234B5A374 -Dtests.security.manager=false -Dtests.bwc.version=2.14.0-SNAPSHOT -Dtests.locale=en -Dtests.timezone=Kwajalein -Druntime.java=21

org.opensearch.neuralsearch.bwc.TextChunkingProcessorIT > testTextChunkingProcessor_E2EFlow FAILED
    org.opensearch.client.ResponseException: method [PUT], host [http://[::1]:40465], URI [/neuralsearch-bwc-testtextchunkingprocessor_e2eflow/_doc/0?refresh=true], status line [HTTP/1.1 404 Not Found]
    {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index [neuralsearch-bwc-testtextchunkingprocessor_e2eflow]","index":"neuralsearch-bwc-testtextchunkingprocessor_e2eflow","index_uuid":"QiuimceqQZK2U_KAzD_5cQ"}],"type":"index_not_found_exception","reason":"no such index [neuralsearch-bwc-testtextchunkingprocessor_e2eflow]","index":"neuralsearch-bwc-testtextchunkingprocessor_e2eflow","index_uuid":"QiuimceqQZK2U_KAzD_5cQ"},"status":404}
        at __randomizedtesting.SeedInfo.seed([9F3C01C234B5A374:52D0CFE8390977E0]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:385)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:355)
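
To compare failures across runs, it can help to pull the error type and index UUID out of a response body like the one in the log. The following is a rough sketch. `ErrorBodyInspector` is a hypothetical helper, and it uses regular expressions instead of a JSON parser only to stay dependency-free.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of neural-search): extracts fields from an error body.
final class ErrorBodyInspector {

    // The first "type" field in the body above belongs to the root_cause entry.
    private static final Pattern TYPE = Pattern.compile("\"type\"\\s*:\\s*\"([^\"]+)\"");
    private static final Pattern INDEX_UUID = Pattern.compile("\"index_uuid\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the first capture group of the first match, if any.
    private static Optional<String> firstMatch(Pattern pattern, String body) {
        Matcher matcher = pattern.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    static Optional<String> rootCauseType(String body) {
        return firstMatch(TYPE, body);
    }

    static Optional<String> indexUuid(String body) {
        return firstMatch(INDEX_UUID, body);
    }
}
```

A test could log both values when a request fails, so that reports from different runs can be compared field by field.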

@zane-neo
Collaborator

Are both failing with this index_not_found_exception?

@yuye-aws
Member Author

Yes!

@yuye-aws
Member Author

I have created an issue to track this: #690. Can we set this problem aside for now?

Signed-off-by: yuye-aws <yuyezhu@amazon.com>
@@ -99,4 +101,11 @@ protected void createPipelineForSparseEncodingProcessor(final String modelId, fi
        );
        createPipelineProcessor(requestBody, pipelineName, modelId);
    }

    protected void createPipelineForTextChunkingProcessor(String pipelineName) throws Exception {
Member

Can you make this consistent with the methods written above? Apart from this, it looks good to me.

Member Author

Updated

Signed-off-by: yuye-aws <yuyezhu@amazon.com>
@yuye-aws
Member Author

The BWC test failure is not caused by this change.

@zane-neo zane-neo merged commit e69752c into opensearch-project:main Apr 17, 2024
62 of 68 checks passed
@vibrantvarun vibrantvarun added the backport 2.x label Apr 22, 2024
@opensearch-trigger-bot
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-661-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 e69752c3824e73e7cd2302b112e02c5091a8f096
# Push it to GitHub
git push --set-upstream origin backport/backport-661-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-661-to-2.x.

yuye-aws added a commit to yuye-aws/neural-search that referenced this pull request Apr 23, 2024
* bwc test for text chunking processor

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* spotless apply

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update changelog

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* spotless apply

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* add test document for restart upgrade

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* rename pipeline configuration file

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* fix pipeline create bug

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* fix pipeline create bug

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* filter tests for lower versions

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* index create in chunking bwc test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* index create in chunking bwc test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* index create in chunking bwc test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* index validate in chunking bwc test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* filter bwc test for lower version

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* bug fix in document ingestion in text chunking test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* ensure index creation in text chunking bwc test

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* add comment

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update index setting

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update change log

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update gradle comment format

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update gradle file format

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* rename bwc test filename

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update gradle file format

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update gradle file to filter tests

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* merge method createPipelineProcessorWithoutModelId

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* text chunking processor it: create pipeline method rename

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* fix it failure

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* include index mapping for text chunking index setting

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

* update nitpicking

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

---------

Signed-off-by: yuye-aws <yuyezhu@amazon.com>
(cherry picked from commit e69752c)
vibrantvarun pushed a commit that referenced this pull request Apr 24, 2024
* Test: bwc test for text chunking processor (#661)

(cherry picked from commit e69752c)

* remove unused import

Signed-off-by: yuye-aws <yuyezhu@amazon.com>

---------

Signed-off-by: yuye-aws <yuyezhu@amazon.com>
@yuye-aws yuye-aws deleted the Test/BWCTextChunking branch May 6, 2024 07:50